Inspired by the fact that people have diverse propensities to punish wrongdoers, we study a spatial public 
goods game with defectors and different types of punishing cooperators. During the game, cooperators punish 
defectors with class-specific probabilities and subsequently share the associated costs of sanctioning. We show 
that in the presence of different punishing cooperators the highest level of public cooperation is always attainable 
through a selection mechanism. Interestingly, selection does not necessarily favor the evolution of punishers who 
would be able to prevail on their own against the defectors, nor does it always hinder the evolution of punishers 
who would be unable to prevail on their own. Instead, the evolutionary success of punishing strategies depends 
sensitively on their invasion velocities, which in turn reveals fascinating examples of both competition and 
cooperation among them. Furthermore, we show that under favorable conditions, when punishment is not 
strictly necessary for the maintenance of public cooperation, the less aggressive, mild form of sanctioning is 
the sole victor of the selection process. Our work reveals that natural strategy selection can not only promote, but 
sometimes also hinder competition among prosocial strategies. 

PACS numbers: 89.75.Fb, 87.23.Ge, 89.65.-s 


I. INTRODUCTION 

Cooperation is vital for the maintenance of public goods in
human societies fflU. But according to Darwin's theory of
evolution, competition rather than cooperation ought to drive
our actions. The reconciliation of this theory with the fact that
cooperation is widespread in human societies, as well as with
the fact that it is much more common in nature than one might
expect, is one of the most persistent challenges in evolutionary
biology and social sciences 00. Past decades have seen
the paradigm of punishment rise as one of the more successful
strategies by means of which cooperation might be promoted
CHia. Indeed, punishment is also the principal tool of
institutions in human societies for maintaining cooperation and
otherwise orderly behavior EHm. However, punishment is
costly, and as such it reduces the payoffs of both the defectors
and those who exercise the punishment, hence yielding
an overall lower income and acting as a drain on social
welfare. Thus, understanding the emergence of costly punishment
is crucial for the evolution of cooperation Il20l427l .

While recent research confirms that punishment is often
motivated by negative personal emotions such as anger or
disgust 0|28l, Raihani and McAuliffe have also shown that
the decision to punish is often motivated by inequity
aversion rather than by the desire for reciprocity
[29]. Although prosocial punishment is widespread in nature
lEolliIl, it is unlikely that cooperators are willing to commit
permanently to punishing wrongdoers. For that, the action is
simply too costly, and hence some form of abstinence is likely,
also to avoid unwanted retaliation. Several research groups
have recently investigated these and related upsides and downsides
of punishment EtI [32143^ . For example, it was shown that
cooperators punish defectors selectively depending on their
current personal emotions, even if the number of defectors
is large. More often than not, however, whether or not
to punish depends on the whim of the moment and is thus
a fairly random event. Motivated by these observations, we
have recently shown that sharing the effort of punishment in a
probabilistic manner can significantly lower the vulnerability
of costly punishment and in fact help stabilize costly altruistic
strategies EH.

Here we drop the assumption that cooperators who do punish
defectors do so uniformly at random. Instead, we account
for the diversity in punishment, taking into account the fact
that some individuals are more likely to punish, while others
punish only rarely. More specifically, we introduce different
threshold levels for punishment, which ultimately introduces
different classes of cooperators that punish defectors. The
assumption of diverse players is not just a realistic hypothesis,
but in general it is firmly established that it also has a decisive
impact on the evolution of public cooperation 07414011 .
Motivated by this fact, we therefore study a spatial public goods
game with defectors and different types of punishing
cooperators. While previously we have demonstrated the
importance of randomly shared punishment ETl, we here approach
a more realistic scenario by assuming that each type of
cooperator will punish with a different probability. Our goal is to
determine whether a specific class of punishing cooperators
will be favored by natural selection, or whether despite the
competition among them synergistic effects will emerge. As
we will show, the evolution is governed by a counterintuitive
selection mechanism, depending further on the synergistic
effects of cooperative behavior. However, before presenting the
main results in detail, we proceed with a more detailed
description of the studied spatial public goods game with different
punishing strategies.


II. SPATIAL PUBLIC GOODS GAME WITH DIVERSE 
PUNISHMENT 


We consider a population of individuals who play the public
goods game on a square lattice of size L × L with periodic
boundary conditions. We assume that the game is contested
between T classes of cooperators ($C_0, C_1, \ldots, C_{T-1}$) and
defectors (D). Independently of the class a cooperator belongs
to, it contributes an amount c to the common pool, while
defectors contribute nothing. After the sum of all contributions
in the group is multiplied by the enhancement factor r > 1,
the resulting amount is shared equally among all group
members.

Moreover, cooperators with strategy $C_i$ ($0 \le i \le T-1$)
choose to punish defectors with a probability $i/(T-1)$ if the
latter are present. As a result, each defector in the group
is punished with a fine a, while all the cooperators who
participated in the punishment equally share the associated
costs. In particular, each punishing cooperator bears the cost
$(G - n_c)a/n_p$, where G is the group size and $n_c$ and $n_p$ are
the number of cooperators and punishers in the group, respectively.
We emphasize that a cooperator who decides to punish bears the same
cost independently of the class it belongs to. Thus, here the
strategy $s = C_i$ only determines how frequently a cooperator
is willing to punish defectors. Nevertheless, it is worth
pointing out that $C_0$ players never punish and thus correspond to
traditional second-order free-riders because they enjoy the
benefits of punishment without contributing to it PITl . On the
other extreme, cooperators belonging to the $C_{T-1}$ class always
punish when defectors are present in the group. Since each
player on site x with von Neumann neighborhood is a member
of five overlapping groups of size G = 5, in each generation
it participates in five public goods games and obtains its total
payoff $P_x = \sum_j P_x^j$, where $P_x^j$ is the payoff gained from
group $G_j$.
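
As a minimal sketch of the payoff rules described above, the following Python function evaluates a single public goods game in one group. The function name, the integer encoding of strategies (i for class $C_i$, -1 for a defector), the default contribution c = 1, and the reading that a defector is fined once whenever at least one cooperator in the group punishes are illustrative assumptions, not details confirmed by the text.

```python
import random

DEFECTOR = -1  # assumed encoding: -1 is D, an integer i >= 0 is class C_i


def group_payoffs(strategies, r, a, T, c=1.0, rng=random):
    """Payoffs of all members of one group for a single game.

    strategies: list of ints, one entry per group member.
    r: enhancement factor, a: punishment fine, T: number of cooperator classes.
    """
    G = len(strategies)
    n_c = sum(1 for s in strategies if s != DEFECTOR)
    n_d = G - n_c
    share = r * c * n_c / G  # equal share of the multiplied common pool
    # A C_i player punishes with probability i/(T-1), only if defectors are present.
    punishes = [s != DEFECTOR and n_d > 0 and rng.random() < s / (T - 1)
                for s in strategies]
    n_p = sum(punishes)
    payoffs = []
    for s, p in zip(strategies, punishes):
        if s == DEFECTOR:
            # Assumption: the fine applies only if somebody actually punished.
            payoffs.append(share - (a if n_p > 0 else 0.0))
        else:
            # Each punisher bears (G - n_c) * a / n_p; non-punishers bear nothing.
            cost = n_d * a / n_p if p else 0.0
            payoffs.append(share - c - cost)
    return payoffs
```

Under these assumptions, the total payoff $P_x$ of a player would be obtained by summing its entry over the five overlapping groups it belongs to.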

Subsequently, a player x, having strategy $S_x$, adopts the
strategy $S_y$ of a randomly chosen neighbor y with the
probability

$$f(S_x \leftarrow S_y) = \frac{1}{1 + \exp[(P_x - P_y)/K]}, \tag{1}$$


where K denotes the amplitude of noise 14^ . Without losing
generality and to ensure continuity of this line of research [43],
we set K = 0.5, meaning that it is very likely that the better
performing players will pass their strategy to their neighbors,
yet it is also possible that players will occasionally learn from
a less successful neighbor. To conclude the description of this
public goods game, we would like to emphasize that different
$C_i$ classes represent different strategies, as our goal is to
explore how the willingness to punish evolves at specific
parameter values.
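
The imitation rule of Eq. (1) can be sketched in a few lines of Python. The function names and the `rng` argument are hypothetical helpers introduced only for illustration; the default K = 0.5 is the value quoted in the text.

```python
import math
import random


def adoption_probability(P_x, P_y, K=0.5):
    """Eq. (1): probability that player x adopts the strategy of neighbor y."""
    return 1.0 / (1.0 + math.exp((P_x - P_y) / K))


def maybe_adopt(S_x, S_y, P_x, P_y, K=0.5, rng=random):
    """Return the strategy of player x after one imitation attempt."""
    if rng.random() < adoption_probability(P_x, P_y, K):
        return S_y
    return S_x
```

For equal payoffs the rule gives a probability of exactly 1/2, and a neighbor with a higher payoff is imitated with a probability above 1/2, which matches the qualitative description above.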
FIG. 1: (Color online) Fraction of cooperators as a function of the
punishment fine a and the probability to punish p, as obtained for
a low multiplication factor r = 3.5 in the original model proposed
in [27], where a uniform probability to punish was assumed for all
cooperators. Note that both a and p have a non-monotonic impact
on the fraction of cooperators.


The model is studied by means of Monte Carlo simulations.
Initially, defectors randomly occupy half of the square lattice,
and each type of cooperator randomly occupies 1/T of the rest of
the lattice. During one full Monte Carlo step (MCS), all
individuals in the population receive a chance once on average to
adopt another strategy. Depending on the proximity to phase
transition points and the typical size of emerging spatial
patterns, the linear system size was varied from L = 120 to 600
and the relaxation time was varied from 10^ to 10® MCS to
ensure proper statistical accuracy. The reported fractions of
competing strategies were determined in the stationary state
when their average values became time-independent.
Alternatively, we have averaged the outcomes over 20 to 100
independent runs when the system terminated in a uniform
absorbing state.
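
The simulation procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: strategies are encoded as integers (i for $C_i$, -1 for D), the contribution is c = 1, punishment decisions are redrawn at every payoff evaluation, and all function names are hypothetical.

```python
import math
import random

D = -1  # assumed encoding: -1 is a defector, i >= 0 is class C_i


def init_lattice(L, T, rng):
    """Half defectors; each cooperator class gets about 1/T of the other half."""
    n_coop = (L * L) // 2
    cells = [D] * (L * L - n_coop) + [i % T for i in range(n_coop)]
    rng.shuffle(cells)
    return [cells[row * L:(row + 1) * L] for row in range(L)]


def neighbors(x, y, L):
    """von Neumann neighborhood with periodic boundary conditions."""
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]


def group_payoff(lat, center, focal, r, a, T, c, rng):
    """Payoff of player `focal` from the group centered on `center`."""
    members = [center] + neighbors(*center, len(lat))
    strat = [lat[i][j] for i, j in members]
    n_d = sum(1 for s in strat if s == D)
    n_c = len(strat) - n_d
    punish = [s != D and n_d > 0 and rng.random() < s / (T - 1) for s in strat]
    n_p = sum(punish)
    k = members.index(focal)
    pay = r * c * n_c / len(strat)
    if strat[k] == D:
        return pay - (a if n_p > 0 else 0.0)
    return pay - c - (n_d * a / n_p if punish[k] else 0.0)


def total_payoff(lat, site, r, a, T, c, rng):
    """Sum over the five overlapping groups that contain `site`."""
    centers = [site] + neighbors(*site, len(lat))
    return sum(group_payoff(lat, ctr, site, r, a, T, c, rng) for ctr in centers)


def monte_carlo_step(lat, r, a, T, K=0.5, c=1.0, rng=random):
    """One MCS: L*L imitation attempts at randomly chosen sites."""
    L = len(lat)
    for _ in range(L * L):
        x = (rng.randrange(L), rng.randrange(L))
        y = rng.choice(neighbors(*x, L))
        if lat[x[0]][x[1]] == lat[y[0]][y[1]]:
            continue  # identical strategies: nothing to adopt
        P_x = total_payoff(lat, x, r, a, T, c, rng)
        P_y = total_payoff(lat, y, r, a, T, c, rng)
        if rng.random() < 1.0 / (1.0 + math.exp((P_x - P_y) / K)):
            lat[x[0]][x[1]] = lat[y[0]][y[1]]
```

A run would call `init_lattice` once and then `monte_carlo_step` repeatedly until the strategy fractions become stationary. The pure-Python loops are meant for small lattices only; the system sizes quoted above would need a faster implementation.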


III. RESULTS 


For the sake of comparison, we first present the fraction of
cooperators in dependence on the punishment fine a and the
probability to punish p at a low r value, as obtained in the
original probabilistic punishment model, where cooperators
punish uniformly at random [27]. Figure 1 illustrates that the
fraction of cooperators first increases, reaches its maximum, and
then decreases again as the values of a and p increase along
the diagonal on the p-a plane. Increasing one of these
parameters, while the other is kept constant, leads to the same
observation. Both a and p thus have a non-monotonic
impact on the fraction of cooperators, which is closely related
to the fact that a characterizes not only the level of
punishment but also its cost. Accordingly, too high values of a
involve too high costs stemming from the act of punishing. It
is worth pointing out that r = 3.5, which is used in Fig. 1, is
a relatively low value of the multiplication factor at which the
FIG. 2: (Color online) Panel (a) shows the fraction of different
cooperator classes in the final state in dependence on a when they
start fighting with defectors simultaneously. Panel (c) shows an
enlarged part of panel (a) at low a values, when cooperation
becomes dominant over defection. To present the overall level of
cooperation in the population, the cumulative fraction of $C_i$
strategies is also shown (denoted by C). For comparison, in panel (b)
we have also plotted the resulting fraction of cooperator classes
when they fight against defectors individually. As in panel (c),
panel (d) shows an enlarged part of panel (b) at a specific interval
of a. The multiplication factor in all panels is r = 3.5.


non-monotonic dependence can still be observed. In
comparison with the results obtained for larger values of r as used
in Ref. 1221, however, the current plot features a significantly
narrower p region where full cooperation is possible when a is
sufficiently large. Similarly, there is a limited region of
intermediate a values where cooperators that punish severely can
beat defectors. Based on these observations, in the present
model we thus explore whether there is an evolutionary selection
among different punishing strategies as they compete against
the defectors simultaneously, or whether they instead cooperate
toward the common goal of deterring defectors.

For an intuitive overview, we set T = 6 and investigate how
the six types of punishing strategies compete and potentially
cooperate with each other in the presence of defectors. The
general conclusion, however, is robust and remains valid if
we use other values of T. Using the same r = 3.5 as in Fig. 1,
the panels of Fig. 2 summarize our main findings. The first
panel shows the fractions of strategies in the final state in
dependence on the punishment fine a when different punishing
strategies fight against defectors simultaneously. For clarity,
we have also plotted the accumulated fraction of punishing
strategies. In contrast to the uniform punishing model, we
can see that the total fraction of cooperators increases
monotonically with increasing a. As Fig. 2(a) illustrates,
cooperators can survive when a > 0.40, and become dominant
for a > 0.42 [see also the enlarged part in Fig. 2(c)]. We
should stress, however, that not all types of cooperators can
survive at equilibrium, even if cooperators take over the whole
population. It turns out that there are some “weak” classes
of cooperators who go extinct before defectors die out, while
other classes of cooperators survive.

For a more in-depth explanation, the vitality of punishing
classes can be estimated if we let them fight against defectors
individually. The outcomes of this scenario are summarized
in Fig. 2(b). Results presented in this panel suggest that there
are punishing classes who can dominate for all high a values,
while others become vulnerable as we increase a. More
interestingly, however, there are mildly punishing strategies who
can survive only due to the support of the more successful
punishing strategies. For example, for a = 0.42 classes $C_5$
and $C_4$ can outperform defectors, while $C_3$ players disappear when
they fight against defectors individually [Fig. 2(b) and (c)].
But when all punishing strategies are on the stage, $C_3$
players can survive as well. This effect is more spectacular
for the second-order free-riding $C_0$ class, who would die out
immediately at such a low synergy factor r if they faced
defectors alone. But now, especially at high a values, their ratio
becomes considerable. This indicates that some less viable
classes of cooperators can survive because of the support of
more viable punishing strategies via an evolutionary selection
mechanism which has a biased impact on the evolution of
otherwise competing strategies.

FIG. 3: (Color online) Evolution of typical spatial patterns, as
obtained for four different prepared initial conditions when using
a = 0.42 and r = 3.5. The first row shows the case when just
a few $C_5$ cooperators are initially present among defectors. It can
be observed that even under such unfavorable initial conditions the
$C_5$ strategy can successfully outperform defectors. The second row
features a similar experiment with the $C_3$ strategy, which fails to
survive among defectors even though the latter are initially in the
minority. The third row illustrates cooperation among strategies $C_3$
and $C_5$, which together dominate the whole population even though $C_3$
alone would fail under the same conditions (see second row). We note that
a neutral drift starts when defectors die out, as explained in the main
text. The fourth row demonstrates, however, that the cooperation
among different punishing strategies illustrated in the third row is
rather fragile. If initially the strategy $C_3$ is replaced by strategy $C_2$,
then the latter simply dies out and subsequently the whole evolution
becomes identical to the one shown in the first row, where strategy
$C_5$ alone outperforms all defectors. For clarity, here the employed
system size is small with just L × L = 100 × 100 players.

To demonstrate the underlying mechanism behind the
above observations, we present a series of snapshots of strategy
evolution starting from different prepared initial states.
The comparative analysis is plotted in Fig. 3, where all runs
were obtained for a = 0.42 and r = 3.5. In the first row,
we demonstrate how the $C_5$ punishing strategy can
prevail over defectors. Initially, only a tiny portion of $C_5$
cooperators is launched in the sea of defectors [the fraction of
$C_5$ is 8 %, see panel (a1)]. Still, $C_5$ cooperators can expand
gradually and invade the whole available territory [shown in
panels (a2) to (a4)]. The second row, which was taken
at the same parameter values, demonstrates clearly the
vulnerability of the $C_3$ class against defectors. Despite the
fact that they occupy the majority of the available room at
the beginning, shown in panel (b1), they are gradually
crowded out by defectors. The final state, shown
in panel (b4), highlights that such a rare punishment activity
represented by the $C_3$ class is ineffective against defectors at the
applied synergy factor r. The third row, where all previously
mentioned strategies are present at the beginning, illustrates a
completely different scenario. Here we start from a balanced
initial state where half of the lattice sites are occupied by $C_3$
and $C_5$ strategies, while the other half is filled by defectors.
As panels (c1) to (c4) illustrate, defectors gradually go
extinct while “weak” $C_3$ cooperators survive and occupy
almost half of the available territory in the final state. We note
that there is a neutral drift between punishing strategies in the
absence of defectors, which will result in a homogeneous state
where the probability to arrive at one of the possible final
destinations is proportional to the fraction of a specific class
at the time defectors die out [44]. This evolutionary outcome
indicates that although $C_3$ players are, as an isolated strategy,
weak against defectors, they can nevertheless survive
because of the assistance of the strong $C_5$ strategy even if the
initial fraction of the latter is modest. In the fourth row,
however, when we arrange a similar setup but replace the weak $C_3$
players with the similarly weak $C_2$ players, the final state is always
the full $C_5$ state. Here, the presence of strong $C_5$ players
does not provide relevant support to $C_2$ players, who therefore
die out, and subsequently the system returns to the scenario
illustrated in panels (a1) to (a4).
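
The neutral drift noted in the text for the defector-free state can be illustrated with a small numerical experiment. The sketch below assumes that, once defectors are gone, all cooperators collect identical payoffs, so Eq. (1) reduces to an adoption probability of 1/2. The function names, the tiny 4 × 4 lattice, and the labels 3 and 5 for the two classes are illustrative choices only.

```python
import random


def neutral_drift_winner(lat, rng):
    """Run neutral imitation until one strategy remains and return it."""
    L = len(lat)
    counts = {}
    for row in lat:
        for s in row:
            counts[s] = counts.get(s, 0) + 1
    while len(counts) > 1:
        i, j = rng.randrange(L), rng.randrange(L)
        di, dj = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        s_new = lat[(i + di) % L][(j + dj) % L]
        s_old = lat[i][j]
        # Equal payoffs: Eq. (1) gives an adoption probability of 1/2.
        if s_new != s_old and rng.random() < 0.5:
            lat[i][j] = s_new
            counts[s_new] += 1
            counts[s_old] -= 1
            if counts[s_old] == 0:
                del counts[s_old]
    return next(iter(counts))


def fixation_frequency(L, n_minority, runs, seed=0):
    """Fraction of runs in which the class labeled 3 takes over the lattice."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(runs):
        cells = [3] * n_minority + [5] * (L * L - n_minority)
        rng.shuffle(cells)
        lat = [cells[k * L:(k + 1) * L] for k in range(L)]
        wins += neutral_drift_winner(lat, rng) == 3
    return wins / runs
```

With 4 of 16 sites initially in class 3, the measured fixation frequency should scatter around 1/4, in line with the proportionality stated in the text.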



FIG. 4: (Color online) Individual competition of three different
punishing strategies, namely $C_2$, $C_3$ and $C_5$, against defectors in
dependence on time. Note that initially only one cooperative strategy
and defectors are present, using the same initial conditions as
illustrated in Fig. 3. A positive value of $\rho_{C_i} - \rho_D$ indicates the
invasion of the cooperator strategy, while a negative value indicates
invasion in the reverse direction. Note that while both $C_2$ and $C_3$
strategies ultimately lose their battle, the latter is able to prevail
significantly longer. This enables effective help from strategy $C_5$ when
they compete against defectors together, as illustrated in panels (c1) to
(c4) in Fig. 3.


The key point, which explains the significantly different
trajectories for mildly punishing strategies, is the
difference in invasion velocities between the competing
strategies. To demonstrate the importance of invasion velocities, we
monitor how the fraction of strategies evolves in time when
we launch the system from a two-strategy state where both
strategies form compact domains. Following the previously
applied approach illustrated in Fig. 3, we compare the
strategy invasions for the $C_2$-D, $C_3$-D, and $C_5$-D
pairs. The comparison of these different cases is plotted
in Fig. 4. As expected, both $C_2$ and $C_3$ lose the lonely fight
against defectors, while $C_5$ will eventually crowd out
defectors. Note that there is only a very slight increase during the
early stages of the evolutionary process that can be observed
for all cases, independently of the final outcome. This is
because straight initial interfaces can provide a strong temporary
phalanx for every punishing strategy. Nevertheless, when
this interface becomes irregular due to invasions, the individual
weakness of the $C_2$ and $C_3$ strategies reveals itself. Still, there
is a significant difference between their trajectories. Namely,
strategy $C_3$ is able to resist for a comparatively long time,
which gives strategy $C_5$ enough time to crowd out defectors.
On the other hand, strategy $C_2$ is too easy a prey for
defectors, which is why it dies out faster than strategy $C_5$ is
able to eliminate all defectors. Ultimately, strategy $C_3$
can thus benefit from cooperation with strategy $C_5$, while strategy
$C_2$ is unable to do the same.
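
The two-strategy experiment described above can be set up with a few helper functions. This is a hedged sketch: the integer encoding (-1 for D), the function names, and the user-supplied `step` callable that advances the lattice by one Monte Carlo step are all assumptions, and the monitored quantity is taken to be the difference between the cooperator and defector fractions.

```python
def two_domain_lattice(L, coop_class, defector=-1):
    """Left half: cooperators of a single class; right half: defectors."""
    return [[coop_class if col < L // 2 else defector for col in range(L)]
            for _ in range(L)]


def fraction_difference(lat, defector=-1):
    """Cooperator fraction minus defector fraction for a C_i-D lattice."""
    cells = [s for row in lat for s in row]
    n_d = cells.count(defector)
    return (len(cells) - 2 * n_d) / len(cells)


def track_invasion(lat, step, n_steps):
    """Record the fraction difference before the run and after each step."""
    series = [fraction_difference(lat)]
    for _ in range(n_steps):
        step(lat)  # hypothetical callable: advances the lattice in place
        series.append(fraction_difference(lat))
    return series
```

Starting from the balanced two-domain state, the recorded series begins at zero. A sustained rise would indicate that the cooperator class invades, and a sustained decline would indicate that defectors invade.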

In the remainder of this work, we focus on the parameter 
region where cooperators are able to coexist with defectors 
FIG. 5: (Color online) Stationary fractions of different cooperator classes in dependence on a when they compete against defectors simultaneously
[panel (a)] and individually [panel (b)]. The cumulative fraction of all punishing strategies (denoted as C in the legend) is also plotted.
The multiplication factor in both panels is r = 4.0, which enables pure cooperators ($C_0$) to coexist with defectors even in the absence of
punishment.


without applying punishment. Namely, if the synergy factor
exceeds r = 3.74, then pure cooperators (cooperators that do
not punish) can survive permanently alongside defectors due
to network reciprocity [43]. Evidently, the presence of
punishers can still elevate the overall cooperation level,
and defectors can be effectively crowded out from the
population 123. Here the main question is thus how the different
punishing strategies will share the available space.

The results are summarized in the left panel of Fig. 5, as
obtained for the representative value of r = 4.0. It can
be observed that, when all the different types of punishing
strategies fight against defectors simultaneously,
cooperators can dominate the whole population above a
threshold value, a > 0.25. However, to evaluate these final
outcomes adequately, we need to know the individual relations
between each particular cooperative strategy and defectors on
a strategy-versus-strategy basis. Therefore, as for the
previously presented low r case in Fig. 2, in the right panel of Fig. 5
we also show the stationary fractions of different cooperator
classes when they compete against defectors individually.
Results presented in panel (b) highlight that too large a values
could be detrimental for the $C_3$, $C_4$ and $C_5$ strategies. This
is the so-called “punish, but not too hard” effect, where too
large costs of sanctioning do more damage to those that
execute punishment than the imposed fines do damage to the
defectors 0 . A direct comparison with the results presented
in panel (a) demonstrates clearly that we can observe a similar
cooperation among punishing strategies as we have reported
before for the low r case, in particular because all the
mentioned mildly punishing strategies can survive even at a high
a value.

On the other hand, a conceptually different mechanism can
be observed in the small a region, which is reminiscent of
what one would actually expect from a selection process.
More specifically, panel (a) of Fig. 5 shows that at a ≈ 0.2
only strategy $C_3$ survives and coexists with D, while all the
other punishing strategies die out. The latter are those
that could survive individually alongside defectors but die
out because of the presence of a more effective ($C_3$)
strategy. Interestingly, the mentioned selection mechanism
works most efficiently when the leading strategy is less
efficient against defectors. The right panel of Fig. 5 shows that
$C_3$ would be unable to crowd out strategy D at these a values,
while a D-free state could be obtained at a higher a value. In
the latter case, when $C_3$ is too powerful, this strategy
beats defectors too fast, which allows other punishing
strategies to survive: this is similar to what we have observed in the
third row of Fig. 3. But when $C_3$ is less effective at smaller
a values, the presence of surviving D players enables $C_3$
players to play out their superior efficiency compared to
other punishing strategies. Thus, depending on the key
parameter values, most prominently the multiplication factor r
and the punishment fine a, the different punishing strategies
can either cooperate with each other or compete against each
other in the spatial public goods game.


IV. CONCLUSION 

We have introduced and studied multiple types of punishing
strategies that sanction defectors with different probabilities.
The fundamental question that we have addressed is whether
there exists a selection mechanism which would result in an
unambiguous victor when these strategies compete against
defectors. We have shown that the answer to this question
depends sensitively on the external conditions, in particular on
the value of the multiplication parameter r. If the public goods
game is demanding due to a low value of r, then the pure
payoff-driven individual selection provides a helping hand to
those punishing strategies that would be unable to survive in
an individual competition against defectors. In particular, we
have demonstrated that the failure or success of a specific
punishing strategy could depend sensitively on the relation of
invasion velocities between specific punishing strategies and the
defectors. Accordingly, if the losing punishing strategy can
delay the complete victory of defectors sufficiently long, then
a more successful punishing strategy has a chance to wipe out
defectors first. This is an example of the cooperation between
different punishing strategies.

On the other hand, in a less demanding environment,
characterized by a higher multiplication factor, a different kind of
relation can emerge. While the previously summarized
cooperation between punishing strategies is still possible, there
also exist parameter regions where competition is the
dominant mode, and indeed there is always a single and
unambiguous victor among the different classes of punishers.
Interestingly, we have shown that this happens when the fittest
punishing strategy is not effective enough to beat defectors
completely. Instead, by carefully taming the defectors, they
help to reveal the advantages of other punishing strategies.
As we have shown, the key point here is again the relation
between the invasion velocities. Namely, too intensive an
invasion will decimate defectors too fast and the advantage of
specific punishing classes will remain forever hidden.
Therefore, in contrast to intuitive expectation, the social diversity of
cooperators in terms of their relations with defectors could be
the result of an effective selection mechanism. We hope that
this research will contribute to our understanding
of the emergence of diversity among competing strategies, as
well as to their role in determining the ultimate fate of the
population.